SPRITE: improving spatial gene expression imputation with gene and cell networks

Abstract
Motivation: Spatially resolved single-cell transcriptomics has provided unprecedented insights into gene expression in situ, particularly in the context of cell interactions or organization of tissues. However, current technologies for profiling spatial gene expression at single-cell resolution are generally limited to the measurement of a small number of genes. To address this limitation, several algorithms have been developed to impute or predict the expression of additional genes that were not present in the measured gene panel. Current algorithms do not leverage the rich spatial and gene relational information in spatial transcriptomics. To improve spatial gene expression predictions, we introduce Spatial Propagation and Reinforcement of Imputed Transcript Expression (SPRITE) as a meta-algorithm that processes predictions obtained from existing methods by propagating information across gene correlation networks and spatial neighborhood graphs.
Results: SPRITE improves spatial gene expression predictions across multiple spatial transcriptomics datasets. Furthermore, SPRITE predicted spatial gene expression leads to improved clustering, visualization, and classification of cells. SPRITE can be used in spatial transcriptomics data analysis to improve inferences based on predicted gene expression.
Availability and implementation: The SPRITE software package is available at https://github.com/sunericd/SPRITE. Code for generating experiments and analyses in the manuscript is available at https://github.com/sunericd/sprite-figures-and-analyses.


Introduction
The advent of spatially resolved single-cell transcriptomics has provided new opportunities to study the biological processes governing cellular interactions and the organization of tissues in situ (Moses and Pachter 2022). Although these spatial transcriptomics methods can detect transcripts at single-cell resolution, they are generally limited to the measurement of a small number of genes (Li et al. 2022). Because current spatial transcriptomics technologies are resource-intensive, it is often infeasible to measure additional genes in a spatial transcriptomics experiment, and computational methods for predicting the spatial expression of additional genes of interest are therefore desirable.
Several computational methods have been developed to impute or predict spatial gene expression for spatial transcriptomics datasets by leveraging whole-transcriptome gene expression information from paired single-cell RNAseq datasets. These approaches typically involve jointly embedding the spatial and RNAseq data. After this joint embedding, these methods predict the expression of new genes by aggregating information across neighboring cells in the RNAseq data (Welch et al. 2019, Abdelaal et al. 2020, Shengquan et al. 2021, Allen et al. 2023). Prediction methods based on other approaches, such as optimal transport (Biancalani et al. 2021), have also been proposed. In virtually all cases, important relational information, such as that between genes (e.g. co-expression) and that between cells (e.g. spatial proximity), is not explicitly used in predicting spatial gene expression. Incorporating this relational information into prediction methods may provide one avenue to improve upon the current state of the art in prediction of spatial gene expression. More accurate predictions of spatial gene expression are desirable and can likely improve the quality of downstream inferences that rely on these predictions.
Here, we develop and introduce Spatial Propagation and Reinforcement of Imputed Transcript Expression (SPRITE), a meta-algorithm that works with any existing spatial gene expression prediction method. SPRITE employs a two-step approach that centers around information propagation in gene correlation networks and spatial neighborhood graphs to refine the baseline predicted spatial gene expression obtained from an existing method. We show that post-processing of spatial gene expression predictions with SPRITE generally leads to more accurate predictions and that these improvements translate to improved performance on common downstream analysis tasks for spatial transcriptomics such as clustering, visualization, and classification of different cell populations.

Benchmark datasets
We evaluated the performance of SPRITE across eleven benchmark spatial transcriptomics and RNAseq dataset pairs (Shah et al. 2016, Karaiskos et al. 2017, Codeluppi et al. 2018, Tasic et al. 2018, Wang et al. 2018, Hodge et al. 2019, Nitzan et al. 2019, Xia et al. 2019, Gyllborg et al. 2020, Zhou et al. 2020, Alon et al. 2021, Booeshaghi et al. 2021, Joglekar et al. 2021, Yao et al. 2021, Li et al. 2022, Lohoff et al. 2022, Lust et al. 2022, Wei et al. 2022, Long et al. 2023) that were compiled and processed by earlier studies (Li et al. 2022, Sun et al. 2024), from which they can be accessed. The spatial transcriptomics and RNAseq datasets in each pair were chosen from the same species and approximately the same tissues. The dataset pairs spanned four organisms (human, mouse, fruit fly, axolotl), eight spatially resolved single-cell transcriptomics technologies, multiple single-cell RNAseq technologies, and multiple tissues. Additional details and properties of these datasets are available in Supplementary Table S1. Before spatial gene expression prediction, the counts in the RNAseq datasets were normalized such that the total count for each cell was equal to the median of total counts across cells, and then the normalized counts were log-transformed with an added pseudocount.
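The normalization described above can be sketched as follows. This is a minimal illustration rather than the authors' preprocessing code: the function name `normalize_log`, the pseudocount of 1, and the natural logarithm are assumptions, since the text does not state them.

```python
import numpy as np

def normalize_log(counts, pseudocount=1.0):
    """Scale every cell (row) to the median total count, then log-transform.

    The pseudocount value and the natural log are assumptions.
    """
    counts = np.asarray(counts, dtype=float)
    totals = counts.sum(axis=1, keepdims=True)
    target = np.median(totals)
    # Avoid division by zero for empty cells; they remain all-zero.
    safe_totals = np.where(totals == 0, 1.0, totals)
    scaled = counts / safe_totals * target
    return np.log(scaled + pseudocount)

# Toy count matrix: 3 cells x 3 genes, with total counts 4, 40, and 8.
counts = np.array([[1, 3, 0], [10, 10, 20], [2, 2, 4]])
normalized = normalize_log(counts)
```

After scaling, every non-empty cell has the same total count (here the median, 8), so differences in sequencing depth no longer dominate the log-transformed values.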

Spatial gene expression prediction
As a meta-algorithm, SPRITE uses the baseline predicted expression obtained from an existing spatial gene expression prediction method. We evaluated SPRITE on three spatial gene expression prediction methods: SpaGE (Abdelaal et al. 2020), Tangram (Biancalani et al. 2021), and Harmony-kNN (Korsunsky et al. 2019). SpaGE performs alignment using domain adaptation and subsequent prediction by k-nearest neighbors regression (Abdelaal et al. 2020). For each dataset, we used either the top 20 or the top half of the principal vectors for SpaGE prediction, whichever provided a larger number of principal vectors. Tangram maps RNAseq expression onto space using a deep learning approach. We performed additional preprocessing steps for the input data according to the suggested approach in Tangram (Biancalani et al. 2021). Harmony-kNN uses Harmony (Korsunsky et al. 2019) to jointly embed the spatial transcriptomics and RNAseq data and then predicts expression for a cell in the spatial transcriptomics data by averaging gene expression from its k nearest neighbors in the RNAseq data. We used $k = 10$ and up to the first 30 harmonized principal components. For all models, the hyperparameter choices were made according to either the default settings or settings used in similar benchmarking contexts (Li et al. 2022, Sun et al. 2024).
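The neighbor-averaging idea behind Harmony-kNN can be illustrated with a short sketch. It assumes the joint embedding has already been computed (the Harmony step itself is not shown), uses Euclidean distance in that embedding (an assumption, as the distance metric is not stated), and the function name `knn_average_predict` is hypothetical.

```python
import numpy as np

def knn_average_predict(spatial_emb, rna_emb, rna_expr, k=10):
    """Predict expression for each spatial cell by averaging the expression
    of its k nearest RNAseq cells in a shared embedding.

    spatial_emb: (n_spatial, d) embedding of spatial cells
    rna_emb:     (n_rna, d) embedding of RNAseq cells
    rna_expr:    (n_rna, n_genes) RNAseq expression to transfer
    """
    # Squared Euclidean distances between every spatial and RNAseq cell.
    diff = spatial_emb[:, None, :] - rna_emb[None, :, :]
    dist2 = (diff ** 2).sum(axis=2)
    # Indices of the k closest RNAseq cells for each spatial cell.
    neighbors = np.argsort(dist2, axis=1)[:, :k]
    # Average the neighbors' expression, gene by gene.
    return rna_expr[neighbors].mean(axis=1)

# Toy example: two groups of RNAseq cells and one spatial cell near each group.
rna_emb = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]])
rna_expr = np.array([[1.0], [3.0], [10.0], [20.0]])
spatial_emb = np.array([[0.0, 0.4], [10.0, 10.6]])
predicted = knn_average_predict(spatial_emb, rna_emb, rna_expr, k=2)
```

Each spatial cell inherits the mean expression of the RNAseq cells closest to it in the embedding, which is why the quality of the joint embedding largely determines the quality of the baseline prediction.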

SPRITE meta-algorithm
The inputs to the SPRITE meta-algorithm are paired single-cell datasets from spatial transcriptomics and RNAseq and a spatial gene expression prediction method (see above section for examples). The goal of SPRITE is to generate predictions of spatial gene expression for a set of genes that are not seen in the spatial transcriptomics data but are present in the RNAseq data. We denote the spatial transcriptomics data $X_{\mathrm{spatial}} \in \mathbb{R}^{n \times p}$ and the RNAseq data $X_{\mathrm{rna}} \in \mathbb{R}^{m \times q}$. In both matrices, the rows correspond to cells and the columns correspond to genes. In most cases where spatial gene expression prediction is desirable, $q \gg p$ (i.e. there are many times more genes in the RNAseq data than in the spatial transcriptomics data) and the genes in $X_{\mathrm{rna}}$ are a superset of the genes in $X_{\mathrm{spatial}}$. In practice, the latter attribute can be enforced by subsetting the spatial transcriptomics data to only include genes also present in the RNAseq data.
The spatial gene expression prediction problem involves predicting the expression of a gene indexed by $j$ in $X_{\mathrm{rna}}$ that is not present in $X_{\mathrm{spatial}}$. Using a spatial gene expression prediction method, the expression of gene $j$ for each cell in $X_{\mathrm{spatial}}$ is predicted using information jointly obtained from $X_{\mathrm{spatial}}$ and $X_{\mathrm{rna}}$. We denote the matrix containing the predicted gene expression for $r$ genes as $G \in \mathbb{R}^{n \times r}$.
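The gene bookkeeping in this setup can be sketched in a few lines. This is a minimal illustration only: the helper `split_gene_sets` and the gene names are hypothetical and are not taken from the SPRITE package.

```python
def split_gene_sets(spatial_genes, rna_genes):
    """Return (shared, rna_only) gene lists for a pair of datasets.

    shared:   genes measured in both datasets (the spatial panel after
              subsetting to genes that also appear in the RNAseq data)
    rna_only: genes measured only in the RNAseq data, i.e. the genes whose
              spatial expression has to be predicted
    """
    rna_set = set(rna_genes)
    spatial_set = set(spatial_genes)
    # Subsetting the spatial panel enforces the superset property.
    shared = [g for g in spatial_genes if g in rna_set]
    rna_only = [g for g in rna_genes if g not in spatial_set]
    return shared, rna_only

# Hypothetical gene names, for illustration only.
spatial_genes = ["Gad1", "Slc17a7", "ProbeOnly1"]
rna_genes = ["Gad1", "Slc17a7", "Sst", "Pvalb", "Vip"]
shared, rna_only = split_gene_sets(spatial_genes, rna_genes)
```

In the notation above, `len(shared)` plays the role of $p$ after subsetting, and any subset of `rna_only` can be chosen as the genes to predict.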
The spatial gene expression prediction workflow, including the SPRITE meta-algorithm and its steps, is depicted in Fig. 1. Given a black-box gene expression prediction method (e.g. SpaGE, Tangram, or Harmony-kNN) and paired datasets $X_{\mathrm{spatial}}$ and $X_{\mathrm{rna}}$, we generate predicted expression for a set of "target" genes that are present in $X_{\mathrm{rna}}$ but not in $X_{\mathrm{spatial}}$ and also for a set of "calibration" genes that are present in both $X_{\mathrm{rna}}$ and $X_{\mathrm{spatial}}$. The calibration genes are used to estimate prediction errors. To predict calibration gene expression without exposing the prediction method to the measured expression of these genes, we used a 10-fold cross-validation approach where, for each fold, we excluded approximately 10% of calibration genes and predicted their expression using the remaining genes. This procedure is repeated until predictions are made for all calibration genes in the spatial transcriptomics data. After combining the predicted expression of the target genes and calibration genes into a single matrix $G$, SPRITE performs a two-step post-processing procedure to improve upon the initial predictions. First, SPRITE reinforces the prediction errors generated using the calibration genes by propagating these errors across a gene correlation network to correct predictions for target gene expression. We refer to this first step as the "Reinforce" step. Next, the predicted gene expression is smoothed across all cells in a spatial neighborhood graph such that neighboring cells with similar measured gene expression (i.e. of the same cell type or subtype) will tend to have more similar predicted gene expression. We refer to this second step as the "Smooth" step. Finally, the SPRITE predicted spatial gene expression can be used in downstream analysis tasks that rely on gene expression data.
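The cross-validated prediction of calibration genes can be sketched as below. This is an illustrative outline under stated assumptions, not the SPRITE implementation: `predict_fn` is a hypothetical stand-in for any baseline method (SpaGE, Tangram, Harmony-kNN), and the toy `mean_predictor` exists only to make the example runnable.

```python
import numpy as np

def cross_validated_calibration(X_spatial, predict_fn, n_folds=10, seed=0):
    """Predict every calibration gene without showing its measured values
    to the predictor, by holding out ~1/n_folds of the genes at a time."""
    n_cells, n_genes = X_spatial.shape
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n_genes), n_folds)
    G_cal = np.full((n_cells, n_genes), np.nan)
    for held_out in folds:
        kept = np.setdiff1d(np.arange(n_genes), held_out)
        # The predictor only sees the measured expression of the kept genes.
        G_cal[:, held_out] = predict_fn(X_spatial[:, kept], kept, held_out)
    return G_cal

def mean_predictor(X_kept, kept_idx, held_out_idx):
    """Hypothetical placeholder predictor: returns each cell's mean over the
    kept genes for every held-out gene. A real method would also use the
    paired RNAseq data."""
    cell_mean = X_kept.mean(axis=1, keepdims=True)
    return np.repeat(cell_mean, len(held_out_idx), axis=1)

rng = np.random.default_rng(1)
X_spatial = rng.poisson(2.0, size=(8, 25)).astype(float)
G_cal = cross_validated_calibration(X_spatial, mean_predictor)
```

Because every gene appears in exactly one held-out fold, the returned matrix contains an out-of-sample prediction for each calibration gene, which is what makes the residuals usable as error estimates.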

Reinforce step with gene correlation network
In the Reinforce step, we use a modified version of the iterative smoothing procedure with guaranteed convergence (Zhou et al. 2003, Huang et al. 2021). Specifically, the update rule for Reinforce is:

$$E^{(t+1)} = (1 - \alpha_r)\, E^{(0)} + \alpha_r\, S_{\mathrm{gene}}\, E^{(t)} \qquad (1)$$

In Equation (1), $E^{(0)} = E$ is the initial residuals matrix containing the prediction errors, $E^{(t)}$ is the residuals matrix after the $t$th update, $S_{\mathrm{gene}}$ is a normalized adjacency matrix from a gene correlation network, and $\alpha_r$ is the smoothing parameter for Reinforce. The residuals matrix is computed as $E = (Y - G)^{T}$, where $G$ is the predicted gene expression matrix and $Y$ is the measured gene expression in $X_{\mathrm{spatial}}$ masked for calibration genes. We set $E^{(0)} = 0$ for all target genes. To select a value for $\alpha_r$, we used 5-fold cross-validation where we held out different subsets of calibration genes by setting $E^{(0)} = 0$ for those genes, performed Reinforce until convergence, and then computed the mean absolute error (MAE) of the propagated residuals with respect to the ground truth residuals of the held-out calibration genes. We selected the $\alpha_r$ corresponding to the lowest average MAE across all cross-validation folds. We performed this search over 10 uniformly spaced values of $\alpha_r$ ranging from 0.01 to 0.9. Using the predicted spatial gene expression as input, we iterate the Reinforce update rule until convergence, i.e. when $E^{(t+1)} = E^{(t)}$. Across all benchmark spatial transcriptomics datasets, the Reinforce step generally converged in under 100 iterations.
The Reinforce step propagates residuals along a gene correlation network, defined by the normalized adjacency matrix $S_{\mathrm{gene}}$. We build the gene correlation network according to the following procedure: (i) calculate the pairwise Spearman rank correlation between the predicted expression of all genes in the combined target and calibration gene sets ($r$ genes in total); (ii) automatically compute the highest Spearman rank correlation threshold for drawing edges between genes such that all genes have at least one neighbor (i.e. no gene is left isolated in the network); (iii) apply the threshold to build an adjacency matrix with binary values indicating the existence of edges between nodes. The adjacency matrix is then normalized by dividing the elements of each row by the row sum to create $S_{\mathrm{gene}} \in \mathbb{R}^{r \times r}$. In this procedure, the thresholds are specific to each spatial transcriptomics dataset and selected to ensure that information can be propagated to every gene in the network. Across all benchmark spatial transcriptomics datasets, the automatic thresholds for the Spearman rank correlation generally ranged from 0.05 to 0.4 for Harmony-kNN, from 0.15 to 0.55 for SpaGE, and from 0.5 to 0.9 for Tangram.
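The network construction can be sketched with NumPy as follows. This is a sketch under stated assumptions, not the package's code: edges are drawn where the signed correlation is at or above the threshold (the text does not say whether absolute values are used), and the function names are hypothetical. The highest threshold that leaves no gene without a neighbor is the smallest, over genes, of each gene's strongest correlation with another gene.

```python
import numpy as np

def average_ranks(x):
    """Ranks of a 1D array starting at 1, with tied values sharing their mean rank."""
    order = np.argsort(x, kind="stable")
    ranks = np.empty(len(x), dtype=float)
    ranks[order] = np.arange(1, len(x) + 1)
    _, inverse, counts = np.unique(x, return_inverse=True, return_counts=True)
    return (np.bincount(inverse, weights=ranks) / counts)[inverse]

def build_gene_network(G):
    """Row-normalized gene network from predicted expression G (cells x genes)."""
    # (i) Spearman correlation = Pearson correlation of per-gene ranks.
    ranks = np.apply_along_axis(average_ranks, 0, G)
    corr = np.corrcoef(ranks, rowvar=False)
    np.fill_diagonal(corr, -np.inf)  # exclude self-correlation
    # (ii) Highest threshold at which every gene keeps at least one neighbor.
    threshold = corr.max(axis=1).min()
    # (iii) Binary adjacency matrix, then row normalization.
    A = (corr >= threshold).astype(float)
    S_gene = A / A.sum(axis=1, keepdims=True)
    return S_gene, threshold

rng = np.random.default_rng(0)
G = rng.normal(size=(30, 6))  # toy predictions: 30 cells x 6 genes
S_gene, threshold = build_gene_network(G)
```

Each row of `S_gene` sums to one, so multiplying by it replaces a gene's value with the average over its correlated neighbors.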
In the final part of the Reinforce step, after Equation (1) has converged, we compute the reinforced spatial gene expression prediction by adding the reinforced residual matrix $E^{(t)}$, transposed back to the cells-by-genes layout of $G$, to the initial predictions: $G^{(0)} = G + (E^{(t)})^{T}$.
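A compact sketch of the Reinforce step is given below. It assumes the update takes the standard label-propagation form of Zhou et al. (2003), $E^{(t+1)} = (1 - \alpha_r) E^{(0)} + \alpha_r S_{\mathrm{gene}} E^{(t)}$, and it replaces the exact-equality stopping rule with a numerical tolerance. The function names and the toy network are illustrative assumptions.

```python
import numpy as np

def propagate(S, X0, alpha, max_iter=1000, tol=1e-8):
    """Iterate X <- (1 - alpha) * X0 + alpha * S @ X until the update is
    smaller than tol (a numerical stand-in for exact convergence)."""
    X = X0.copy()
    for _ in range(max_iter):
        X_next = (1.0 - alpha) * X0 + alpha * (S @ X)
        if np.max(np.abs(X_next - X)) < tol:
            return X_next
        X = X_next
    return X

def reinforce(G, Y_cal, cal_idx, S_gene, alpha_r):
    """Correct predictions G (cells x genes) using calibration residuals.

    Y_cal holds measured expression for the calibration genes (columns
    cal_idx of G). Residual rows of target genes start at zero.
    """
    E0 = np.zeros_like(G.T)                     # genes x cells
    E0[cal_idx, :] = (Y_cal - G[:, cal_idx]).T  # calibration residuals
    E = propagate(S_gene, E0, alpha_r)
    return G + E.T                              # back to cells x genes

# Toy example: 4 cells, 3 genes (genes 0 and 1 are calibration genes).
rng = np.random.default_rng(0)
G = rng.normal(size=(4, 3))
Y_cal = rng.normal(size=(4, 2))
cal_idx = np.array([0, 1])
A = np.array([[0.0, 1.0, 1.0], [1.0, 0.0, 1.0], [1.0, 1.0, 0.0]])
S_gene = A / A.sum(axis=1, keepdims=True)
G_reinforced = reinforce(G, Y_cal, cal_idx, S_gene, alpha_r=0.5)
```

Because $S_{\mathrm{gene}}$ is row-normalized and $\alpha_r < 1$, the iteration is a contraction and converges to the fixed point $(1 - \alpha_r)(I - \alpha_r S_{\mathrm{gene}})^{-1} E^{(0)}$, which gives a quick way to check the loop on small inputs.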

Smooth step with spatial neighborhood graph
In the Smooth step, we use a similar iterative smoothing procedure as in the Reinforce step. Specifically, the update rule for Smooth is:

$$G^{(t+1)} = (1 - \alpha_s)\, G^{(0)} + \alpha_s\, S_{\mathrm{spatial}}\, G^{(t)} \qquad (2)$$

In Equation (2), $G^{(0)}$ is the reinforced spatial gene expression prediction matrix, $G^{(t)}$ is the spatial gene expression matrix after the $t$th update, $S_{\mathrm{spatial}}$ is a normalized adjacency matrix from a spatial neighborhood graph of the cells in the spatial transcriptomics data, and $\alpha_s$ is the smoothing parameter for Smooth. The input to the Smooth step is the output of the Reinforce step, $G^{(0)} = G + (E^{(t)})^{T}$. For the ablated SPRITE model without the Reinforce step (No Reinforce), we set $G^{(0)} = G$. To select the optimal $\alpha_s$, we performed a line search across 10 uniformly spaced values of $\alpha_s$ ranging from 0.01 to 0.9, computing the MAE between the smoothed predictions output by the Smooth step and the measured expression for all calibration genes. We selected the value of $\alpha_s$ that minimized this MAE. Using the reinforced spatial gene expression predictions as input, we iterate the Smooth update rule until convergence, i.e. when $G^{(t+1)} = G^{(t)}$. Across all benchmark spatial transcriptomics datasets, the Smooth step generally converged in under 100 iterations.
To construct spatial neighborhood graphs with nodes corresponding to the cells in $X_{\mathrm{spatial}}$, we drew edges between each cell and its $k$ nearest neighbors according to Euclidean distance between the centroids of each cell. We set $k = 50$ for all analyses presented in this study, but also observed that the performance of SPRITE was generally robust to most reasonable choices of $k$ (ranging from $k = 10$ through $k = 100$ for some datasets). To remove outlier edges that span exceptionally large distances, we removed all edges with distance >1.5 times the interquartile range of all edge distances. This resulted in a binary adjacency matrix representing the edges in the spatial neighborhood graph. Importantly, propagation of predicted gene expression across cells of different cell types or subtypes is not desirable. To selectively propagate information to cells with similar transcriptomic profiles (i.e. similar cell type), we weighted each edge by the similarity in measured ground truth expression of all genes in the spatial transcriptomics data between neighboring cells. To reduce distortions from high-dimensional vectors, we first applied principal component analysis to the measured gene expression and used the top five principal components to compute the cosine similarity for weighting each edge. We then normalized the weighted adjacency matrix by dividing each row by its sum to generate $S_{\mathrm{spatial}}$. Finally, the predicted gene expression values are propagated across the spatial neighborhood graph according to Equation (2) until convergence. The output of the Smooth step, $G^{(t)}$, is the SPRITE predicted spatial gene expression matrix.
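The graph construction and the Smooth step can be sketched together as follows. Several details are assumptions made for illustration and are not stated in the text: the update is taken to be $G^{(t+1)} = (1 - \alpha_s) G^{(0)} + \alpha_s S_{\mathrm{spatial}} G^{(t)}$, the k-nearest-neighbor graph is kept directed, negative cosine similarities are clipped to zero, cells left without any weighted edge receive a self-loop so that row normalization is defined, and convergence is checked with a tolerance. The dense distance matrix is only suitable for small examples.

```python
import numpy as np

def propagate(S, X0, alpha, max_iter=1000, tol=1e-8):
    """Iterate X <- (1 - alpha) * X0 + alpha * S @ X up to a tolerance."""
    X = X0.copy()
    for _ in range(max_iter):
        X_next = (1.0 - alpha) * X0 + alpha * (S @ X)
        if np.max(np.abs(X_next - X)) < tol:
            return X_next
        X = X_next
    return X

def build_spatial_graph(coords, X_measured, k=50, n_pcs=5):
    """Row-normalized, similarity-weighted k-nearest-neighbor graph of cells."""
    n = coords.shape[0]
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    np.fill_diagonal(dist, np.inf)               # no self-neighbors
    neighbors = np.argsort(dist, axis=1)[:, :k]
    A = np.zeros((n, n))
    A[np.repeat(np.arange(n), k), neighbors.ravel()] = 1.0
    # Drop edges longer than 1.5 x IQR of all edge distances (as in the text).
    q1, q3 = np.percentile(dist[A > 0], [25, 75])
    A[dist > 1.5 * (q3 - q1)] = 0.0
    # Cosine similarity on the top principal components of measured expression.
    Xc = X_measured - X_measured.mean(axis=0)
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    pcs = U[:, :n_pcs] * s[:n_pcs]
    unit = pcs / np.maximum(np.linalg.norm(pcs, axis=1, keepdims=True), 1e-12)
    cos = np.clip(unit @ unit.T, 0.0, None)      # assumption: clip negatives
    W = A * cos
    isolated = np.flatnonzero(W.sum(axis=1) == 0)
    W[isolated, isolated] = 1.0                  # assumption: self-loop fallback
    return W / W.sum(axis=1, keepdims=True)

def smooth(S_spatial, G0, Y_cal, cal_idx, alphas=np.linspace(0.01, 0.9, 10)):
    """Line search over alpha_s, scored by MAE on the calibration genes."""
    best = None
    for alpha in alphas:
        G = propagate(S_spatial, G0, alpha)
        mae = np.mean(np.abs(G[:, cal_idx] - Y_cal))
        if best is None or mae < best[0]:
            best = (mae, alpha, G)
    return best[2], best[1]

# Toy data: two spatially separated groups of cells with different expression.
rng = np.random.default_rng(0)
coords = np.vstack([rng.normal(0, 1, (20, 2)), rng.normal(10, 1, (20, 2))])
X_measured = np.vstack([rng.normal(0, 1, (20, 6)), rng.normal(3, 1, (20, 6))])
G0 = X_measured + rng.normal(0, 1, X_measured.shape)  # noisy "predictions"
S_spatial = build_spatial_graph(coords, X_measured, k=5)
G_smooth, alpha_s = smooth(S_spatial, G0, X_measured, np.arange(6))
```

Weighting edges by expression similarity means that a cell borrows mostly from nearby cells that look like it, which limits the blurring of predictions across cell type boundaries.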

Improvement in prediction
To evaluate SPRITE across the benchmark datasets and spatial gene expression prediction methods, we employed several statistical metrics to reflect performance in prediction quality. For a given gene, we measured the Pearson correlation coefficient (PCC) and the mean absolute error (MAE) between its baseline predicted expression and its measured expression across all cells in the spatial transcriptomics data. For the same gene, we also measured the PCC and MAE for the SPRITE predicted expression and the measured expression. To compute the improvement in performance provided by SPRITE, we computed the difference of each measure between the baseline predicted expression and the SPRITE predicted expression. The difference was computed such that
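The per-gene metrics and their differences can be sketched as below. The sign convention is an assumption chosen so that positive values mean SPRITE did better (higher PCC, lower MAE), and the function names are hypothetical.

```python
import numpy as np

def per_gene_metrics(pred, measured):
    """Per-gene Pearson correlation and mean absolute error.

    Both inputs are cells x genes; metrics are computed down each column.
    """
    pc = pred - pred.mean(axis=0)
    mc = measured - measured.mean(axis=0)
    pcc = (pc * mc).sum(axis=0) / np.sqrt((pc ** 2).sum(axis=0) * (mc ** 2).sum(axis=0))
    mae = np.abs(pred - measured).mean(axis=0)
    return pcc, mae

def improvement(baseline, sprite, measured):
    """Differences oriented so that positive values favor SPRITE (assumed)."""
    pcc_b, mae_b = per_gene_metrics(baseline, measured)
    pcc_s, mae_s = per_gene_metrics(sprite, measured)
    return pcc_s - pcc_b, mae_b - mae_s

# Toy example with one gene measured in four cells.
measured = np.array([[0.0], [1.0], [2.0], [3.0]])
baseline = np.array([[0.0], [2.0], [1.0], [3.0]])  # PCC 0.8, MAE 0.5
sprite = measured + 0.1                            # PCC 1.0, MAE 0.1
delta_pcc, delta_mae = improvement(baseline, sprite, measured)
```

Reporting both differences matters because a post-processing step can improve one metric while slightly worsening the other.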

Cell clustering and evaluation
We performed clustering of cells using the Leiden algorithm (Traag et al. 2019). First, we set all negative values to zero in the gene expression matrix, normalized and log-transformed the counts, and performed principal component analysis (PCA). Then, we performed Leiden clustering using the default settings in the Scanpy package (Wolf et al. 2018) (version 1.9.3).
For the clustering experiments, we used the three spatial transcriptomics datasets that had publicly available cell type annotations, which included the mouse somatosensory osmFISH dataset, the mouse gastrulation seqFISH dataset, and the axolotl telencephalon Stereo-seq dataset. For the mouse somatosensory osmFISH dataset, we used cell type annotations under "ClusterName" from the metadata available at http://linnarssonlab.org/osmFISH/osmFISH_SScortex_mouse_all_cells.loom. For the mouse gastrulation seqFISH dataset, we used cell type annotations under "cell type mapped refined" from the metadata available at https://content.cruk.cam.ac.uk/jmlab/SpatialMouseAtlas2020/ in the "metadata.Rds" file for "embryo1" and "z5." For the axolotl telencephalon Stereo-seq dataset, we retrieved cell type annotations under "Annotation" from the metadata available at https://db.cngb.org/stomics/artista/ for the "Stage44.h5ad" object file.
To evaluate the quality of cell clusters, we used the adjusted Rand index to compare cluster assignments obtained on either baseline predicted spatial gene expression or SPRITE predicted spatial gene expression to the ground truth cell type labels.
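For reference, the adjusted Rand index can be computed directly from its standard contingency-table formula. The sketch below is a plain standard-library implementation for illustration; the text does not say which implementation was used in the study.

```python
from collections import Counter
from math import comb

def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index between two labelings of the same cells.

    1.0 means identical partitions (up to renaming of clusters); values
    near 0 are expected for random assignments.
    """
    n = len(labels_true)
    pairs = Counter(zip(labels_true, labels_pred))
    sum_ij = sum(comb(c, 2) for c in pairs.values())
    sum_a = sum(comb(c, 2) for c in Counter(labels_true).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_pred).values())
    expected = sum_a * sum_b / comb(n, 2)
    max_index = 0.5 * (sum_a + sum_b)
    if max_index == expected:  # degenerate case, e.g. a single cluster in both
        return 1.0
    return (sum_ij - expected) / (max_index - expected)

cell_types = ["neuron", "neuron", "glia", "glia"]
same_partition = adjusted_rand_index(cell_types, [1, 1, 0, 0])
crossed_partition = adjusted_rand_index(cell_types, [0, 1, 0, 1])
```

The index depends only on which cells are grouped together, so cluster labels do not need to match the names of the annotated cell types.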

Cell visualization and evaluation
We generated 2D visualizations of cells in spatial transcriptomics data using the UMAP (McInnes et al. 2018), t-SNE (van der Maaten and Hinton 2008) with PCA initialization, and PCA algorithms applied to different versions of the gene expression matrix. The gene expression matrices were standardized by features before input to the visualization algorithms. We performed visualization with all three algorithms on all eleven benchmark datasets.
To evaluate the quality of these visualizations, we computed the Spearman visualization score, which is the Spearman rank correlation coefficient between the pairwise Euclidean distances of cells in the visualization coordinates and in the original high-dimensional data.The metric was computed using the concordance score method in DynamicViz (Sun et al. 2023).
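The idea behind the Spearman visualization score can be sketched directly with NumPy. This is an illustrative re-implementation of the definition given above, not a call into DynamicViz; the function names are hypothetical and the dense pairwise distance computation is only practical for small numbers of cells.

```python
import numpy as np

def average_ranks(x):
    """Ranks of a 1D array starting at 1, with tied values sharing their mean rank."""
    order = np.argsort(x, kind="stable")
    ranks = np.empty(len(x), dtype=float)
    ranks[order] = np.arange(1, len(x) + 1)
    _, inverse, counts = np.unique(x, return_inverse=True, return_counts=True)
    return (np.bincount(inverse, weights=ranks) / counts)[inverse]

def pairwise_distances(X):
    """Euclidean distances for all unordered pairs of rows in X."""
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    return dist[np.triu_indices(X.shape[0], k=1)]

def spearman_visualization_score(high_dim, embedding):
    """Spearman correlation between pairwise cell distances in the original
    data and in the low-dimensional visualization."""
    r_high = average_ranks(pairwise_distances(high_dim))
    r_low = average_ranks(pairwise_distances(embedding))
    return np.corrcoef(r_high, r_low)[0, 1]

# Toy check: data that truly lives in 2D is perfectly preserved by a scaled
# copy of those two dimensions, and poorly by a shuffled embedding.
rng = np.random.default_rng(0)
base = rng.normal(size=(15, 2))
high_dim = np.hstack([base, np.zeros((15, 3))])
score_faithful = spearman_visualization_score(high_dim, 3.0 * base)
score_shuffled = spearman_visualization_score(high_dim, base[rng.permutation(15)])
```

Because the score is rank-based, it rewards visualizations that keep the ordering of distances between cells, regardless of the overall scale of the plot.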

Cell type classification and evaluation
We used L1-penalized logistic regression models and trained them on different versions of the gene expression matrix to predict cell type labels (using up to the three most prevalent cell types). The models were evaluated under stratified 5-fold cross-validation using the cell type labels for stratification. To evaluate the quality of the classifiers, we computed the accuracy, macro-averaged F1 score, and the area under the receiver operating characteristic curve (AUC-ROC). These metrics were averaged across all cross-validation folds.

SPRITE improves prediction of spatial gene expression
We developed SPRITE, a meta-algorithm that post-processes predicted spatial gene expression by propagating prediction errors across a gene correlation network (Reinforce) and smoothing predictions across a spatial neighborhood graph (Smooth) (Fig. 1). To evaluate whether SPRITE improved the quality of spatial gene expression predictions, we applied SPRITE to spatial gene expression predictions generated from three methods (SpaGE, Tangram, Harmony-kNN) and across eleven paired spatial transcriptomics and RNAseq dataset benchmarks. Relative to the baseline predicted spatial gene expression, SPRITE predicted spatial gene expression was generally better correlated and had lower mean absolute error with respect to the measured ground truth expression (Fig. 2A and B). Interestingly, for Tangram, SPRITE produced slightly negative improvement under the correlation metric but large positive improvement under the error metric (Fig. 2A and B). This trend was due to the tendency of Tangram to predict extreme values for gene expression, which can skew correlation measures but also provides opportunity for large reductions in prediction error. In aggregate and across all evaluation contexts (i.e. unique combinations of dataset and prediction method), we observed improvements in at least one of the aforementioned performance metrics for prediction quality (Fig. 2C). Moreover, most genes also showed improvements in prediction for at least one of the performance metrics (Fig. 2C). In many cases, the improvement in gene expression prediction when using SPRITE was visually striking. For example, SPRITE can recover spatial expression patterns that were missing in the baseline predictions (Fig. 2D). Conversely, SPRITE can also attenuate high baseline predictions to better capture selective spatial expression of certain genes (Fig. 2E). In summary, SPRITE provides general improvements in the quality of predicted spatial gene expression across different datasets and prediction methods.

Reinforce and smooth steps are necessary for improvements by SPRITE
Given the improvement in predicted gene expression provided by SPRITE, we investigated the relative contributions of the Reinforce and Smooth steps to the observed improvement. To do so, we developed SPRITE implementations where one of the two steps was ablated from the pipeline (i.e. "No Reinforce" and "No Smooth"). After ablating these steps, we performed identical experiments as before to measure the improvement in prediction provided through the modified versions of SPRITE. We observed that both Smooth and Reinforce can provide improvements over baseline prediction when used in isolation, but that additional improvements can be achieved when they are combined such as in SPRITE (Fig. 3A and B). This result suggests that Reinforce and Smooth are synergistic in the SPRITE algorithm.

SPRITE improves clustering of cells by cell type
Of particular interest is whether the improvements in prediction quality by SPRITE can be transferred to common downstream analysis tasks such as the clustering of cells. Clustering is often used to identify cell types or subtypes in dimensional visualizations of the measured ground truth, baseline predicted, and SPRITE predicted spatial gene expression. We visualized the mouse somatosensory osmFISH dataset using UMAP on the baseline and SPRITE predicted expression, and the visualizations with SPRITE showed markedly better separation of cells by their anatomic region, which is desirable since the tissue is highly structured (Fig. 4C). More broadly, we observed that visualizations (UMAP, t-SNE, and PCA) generated on SPRITE predicted spatial gene expression were consistently better than visualizations obtained from baseline predicted spatial gene expression at preserving the pairwise distances between cells with respect to the original high-dimensional data and were often comparable in quality to visualizations generated from the measured ground truth gene expression (Fig. 4D). As such, visualizations of SPRITE predicted expression can more faithfully represent the spatial location of cells and their relations in the spatial transcriptomics data.

SPRITE improves cell type classification
Finally, we explored whether SPRITE would improve underlying biological signal in the predicted expression that could be detected by models trained and evaluated on the data. Specifically, we compared the performance of cell type classifiers trained on either baseline predicted expression or on SPRITE predicted expression (Fig. 5A). Across the three spatial transcriptomics datasets with cell type annotations and evaluated on three different classification metrics, we observed consistent and statistically significant improvements in classification performance when models are trained on the SPRITE predicted expression (Fig. 5B-D). In several cases, the performance of the SPRITE-based model can even exceed that of models trained on the measured expression (Fig. 5B-D).

Discussion
SPRITE is a meta-algorithm that functions with any existing or future spatial gene expression prediction method. We show that SPRITE generally increases the quality of spatial gene expression predictions, and that this improvement is the combined result of the Reinforce and Smooth steps in the SPRITE pipeline. Furthermore, we show that SPRITE can be extended to improve common downstream analysis tasks such as clustering of cells, faithful visualization of spatial transcriptomics, and classification of cells by cell type. Surprisingly, in several tasks, SPRITE not only outperforms baseline methods but can exceed approaches that use ground truth measured expression, suggesting that in addition to improved spatial gene expression prediction, SPRITE may also confer other advantages, perhaps through de-noising gene expression and removing outliers. Ultimately, we believe that SPRITE will be a valuable tool in leveraging spatial transcriptomics to uncover new biological processes.
Although other meta-algorithms have been developed for predicted spatial gene expression (Sun et al. 2024), SPRITE is the first that can modify the predicted gene expression values and thus can be readily extended into downstream applications. SPRITE is highly scalable since its computational complexity is related by a multiplicative constant to that of the underlying prediction method responsible for generating the baseline predicted gene expression. This constant is determined by the number of cross-validation folds for calibration gene expression prediction and can therefore be adjusted for different contexts and desired runtimes. The SPRITE runtime is generally dominated by the generation of baseline spatial gene expression prediction. For example, the median single-threaded runtimes for SPRITE across the benchmark datasets were 242 s for generating 10-fold spatial gene expression predictions, 13 s for Reinforce, and 33 s for Smooth. Since our ablation studies showed that both Reinforce and Smooth can independently provide improvements over baseline predictions, in cases demanding particularly high computational efficiency, SPRITE (No Reinforce) may be a desirable alternative to the full SPRITE algorithm since the Smooth step does not require generating multiple cross-validation folds for predicting calibration gene expression.
An open question is whether SPRITE can still improve predicted gene expression if the underlying baseline prediction method explicitly uses spatial information and gene correlation network information. Currently, no baseline spatial gene expression prediction method meets both criteria. Similarly, it would be worthwhile to design end-to-end prediction methods that use spatial and gene correlation information and compare the performance of these methods to the multi-step approach used by SPRITE. Other formulations of the gene correlation network and spatial neighborhood graph, particularly in leveraging existing metadata such as cell type labels or in weighting connections between nodes, may provide useful directions for improving the SPRITE algorithm. As spatial transcriptomics data becomes available for more species, tissues, and biological conditions, it will be necessary to verify that the assumptions of SPRITE can also generalize to these new contexts.
Although we have considered several common spatial transcriptomics analysis tasks, exciting new approaches are emerging in this field for investigating cellular interactions and characterizing local neighborhood effects (Moses and Pachter 2022). Testing the ability of SPRITE to yield improvements on these additional frameworks would further extend the range of applications for SPRITE and spatial gene expression prediction. In addition, combining SPRITE with new methods for estimating spatial gene expression prediction uncertainty (Sun et al. 2024) may provide an ecosystem for interpreting conclusions drawn from predicted spatial gene expression and performing useful scientific inference. Although we have only evaluated SPRITE on spatially resolved single-cell transcriptomics, the algorithm might also be applicable to other spatial data modalities such as spatial proteomics and spot-based spatial transcriptomics.

Sun et al.

Figure 4. SPRITE predicted spatial gene expression provides more faithful clustering and visualization of cells. (A) Illustration of evaluation scheme where cells are represented by clustering and visualization based on either baseline predicted spatial transcriptomes or SPRITE-predicted transcriptomes. (B) Cell clustering quality measured by the adjusted Rand index with cell type as the ground truth label on all three spatial transcriptomics datasets that included cell type annotations. Clustering used the Leiden algorithm on either baseline predicted spatial gene expression or SPRITE predicted spatial gene expression. (C) Visualization of all cells in the osmFISH mouse somatosensory cortex dataset using UMAP projections of either the baseline predicted spatial gene expression (top left) or the SPRITE predicted spatial gene expression (top right) along with a cutout showing the spatial location of all cells (bottom). Cells are colored by their anatomic region labels. (D) Spearman visualization scores indicating preservation of pairwise cell distances in 2D visualization with respect to the original high-dimensional data using either the measured gene expression, baseline predicted gene expression, or SPRITE predicted gene expression profiles for visualization. Results are shown for eleven spatial transcriptomics datasets and aggregated over three spatial gene expression prediction methods (SpaGE, Tangram, Harmony-kNN) and three dimensionality reduction methods for visualization (UMAP, t-SNE, PCA).
Figure 1. Illustration of the steps in the SPRITE algorithm. Baseline spatial gene expression predictions are made from any available prediction method. SPRITE builds a gene correlation network for the Reinforce step to propagate prediction errors from calibration genes to unseen genes. SPRITE builds a spatial graph of neighboring cells for the Smooth step to propagate predicted expression across neighbors. The SPRITE-processed predicted expression can be used in downstream analysis tasks.